A global-ranking local feature selection method for text categorization
Abstract
In this paper, we propose a filtering method for feature selection called ALOFT (At Least One FeaTure). The proposed method focuses on specific characteristics of the text categorization domain: it ensures that every document in the training set is represented by at least one feature, and the number of selected features is determined in a data-driven way. We compare the effectiveness of the proposed method with the Variable Ranking method using three text categorization benchmarks (Reuters-21578, 20 Newsgroups and WebKB), two classifiers (k-Nearest Neighbor and Naïve Bayes) and five feature evaluation functions. The experiments show that ALOFT obtains results equivalent to or better than those of the classical Variable Ranking method. © 2012 Elsevier Ltd. All rights reserved.
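The two properties stated in the abstract (every training document keeps at least one feature, and the size of the selected set comes from the data rather than from a preset cutoff) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each document is a set of terms and that a feature evaluation function has already assigned every term a score. The function name `select_at_least_one`, the toy documents and the score values are all hypothetical.

```python
from typing import Dict, List, Set


def select_at_least_one(docs: List[Set[str]], score: Dict[str, float]) -> List[str]:
    """For each document, keep its best-scoring term; return the union in first-seen order.

    Illustrative sketch only: `score` stands in for any feature evaluation
    function computed beforehand on the training set.
    """
    selected: List[str] = []
    seen: Set[str] = set()
    for terms in docs:
        if not terms:
            continue  # an empty document cannot be covered by any term
        # Highest score wins; ties are broken alphabetically for determinism.
        best = min(terms, key=lambda t: (-score.get(t, 0.0), t))
        if best not in seen:
            seen.add(best)
            selected.append(best)
    return selected


# Hypothetical toy data: four documents and made-up term scores.
docs = [
    {"oil", "price", "market"},
    {"market", "stock"},
    {"wheat", "price"},
    {"oil", "stock"},
]
score = {"oil": 0.9, "wheat": 0.8, "stock": 0.5, "market": 0.4, "price": 0.3}

selected = select_at_least_one(docs, score)
print(selected)
```

In this toy run the selected set has three terms, and that number falls out of the documents themselves. A plain ranking with a fixed cutoff of two would keep only `oil` and `wheat` and leave the second document without any feature.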
Similar resources
A General Investigation on the Combination of Local and Global Feature Selection Methods for Request Identification in Telegram
Nowadays, the use of various messaging services is expanding worldwide with the rapid development of Internet technologies. Telegram is a cloud-based open-source text messaging service. According to the US Securities and Exchange Commission, and based on statistics reported from October 2019 to the present, 300 million people worldwide used Telegram per month. Telegram users are more concentrated in...
Improving the Operation of Text Categorization Systems with Selecting Proper Features Based on PSO-LA
With the explosive growth in the amount of information, tools and methods are needed to search, filter and manage resources. One of the major problems in text classification relates to high-dimensional feature spaces. Therefore, the main goal of text classification is to reduce the dimensionality of the feature space. There are many feature selection methods. However...
Data-driven global-ranking local feature selection methods for text categorization
Roberto H.W. Pinheiro, George D.C. Cavalcanti (corresponding author),... http://dx.doi.org/10.1016/j.eswa.2014.10.011. 0957-4174/© 2014 Elsevier Ltd. All rights reserved. URL: http://www.cin.ufpe.br/~viisar (G.D.C. Cavalcanti).
A Hybrid Feature Selection Approach for Arabic Documents Classification
Text categorization (classification) is the process of classifying documents into a predefined set of categories based on their content. Text categorization algorithms usually represent documents as bags of words and consequently have to deal with a huge number of features. Feature selection tries to find a set of relevant terms to improve both efficiency and generalization. There are two main ap...
An Examination of Feature Selection Frameworks in Text Categorization
Feature selection, an important task in text categorization, is used for dimensionality reduction. It can be performed locally or globally. For local selection, distinct feature sets are derived from different classes, so the number of feature sets depends on the number of classes. In contrast, only one universal feature set is used in global fea...
Journal: Expert Syst. Appl.
Volume: 39
Publication year: 2012